April 1945 


BIOMETRICS BULLETIN


THE BIOMETRICS SECTION, AMERICAN STATISTICAL ASSOCIATION 


SOME USES OF STATISTICAL METHODS IN PLANT BREEDING 


F. R. Immer
Minnesota Agricultural Experiment Station 


The problem of the plant breeder is the 
production of new varieties that are more de- 
sirable than those grown previously on the 
farms in his state or region. His objective is 
to combine into a single variety, or hybrid, as 
many desirable characters as possible, includ- 
ing yield. Only very large differences in yield 
can be judged visually, so that replicated field 
trials are necessary for comparing quantita- 
tively the yields of various strains. 


When the number of varieties, or strains, 
to be tested is small, the method of randomized 
complete blocks (8, 9, 11, 17) is the most 
common experimental design. This design is 
applicable for any number of replications. 
Missing plots may be interpolated easily. If 
several plots of a variety are missing, the variety
may be omitted from the analysis and the
remaining varieties compared without bias.
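The interpolation of a single missing plot can be sketched as follows. The formula used, x = (rB + tT - G) / ((r - 1)(t - 1)), is the usual missing-plot estimate for randomized complete blocks; it is not written out in the text above. The function name, the data layout and the yields are illustrative assumptions only.

```python
def estimate_missing_plot(yields, block, variety):
    """Estimate one missing yield in a randomized complete block trial.

    yields  : list of blocks, each a list of variety yields, with None
              at the single missing plot (assumed layout)
    block   : index of the block containing the missing plot
    variety : index of the variety whose plot is missing
    """
    r = len(yields)        # number of blocks (replications)
    t = len(yields[0])     # number of varieties
    # B: total of the remaining plots in the affected block
    block_total = sum(v for v in yields[block] if v is not None)
    # T: total of the remaining plots of the affected variety
    variety_total = sum(row[variety] for row in yields if row[variety] is not None)
    # G: grand total of all observed plots
    grand_total = sum(v for row in yields for v in row if v is not None)
    return (r * block_total + t * variety_total - grand_total) / ((r - 1) * (t - 1))


# Hypothetical yields: 3 blocks by 4 varieties, one plot lost in block 2.
example = [
    [32.1, 28.4, 35.0, 30.2],
    [30.5, None, 33.8, 29.1],
    [33.0, 29.6, 36.1, 31.4],
]
print(round(estimate_missing_plot(example, block=1, variety=1), 2))
```

When the data are exactly additive in block and variety effects, this estimate reproduces the missing value exactly, which is a convenient check on the arithmetic.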


When the number of varieties is large, the 
greater soil variation between the plots within 
the larger blocks increases the error of the 
experiment. To overcome this difficulty the 
lattice designs have been devised (7, 21, 
22, 23). 


In addition to the control of field variation 
by the lattice design, standard varieties are 
often planted as checks between the blocks of
the experiment. Frequent reference to these
standards is highly desirable in observing differences
in characters other than yield. New
varieties which are inferior to the standards in
any important character are discarded.


In the simple and triple lattices (7, 12)
each complete replication of $k^2$ varieties is divided
into smaller blocks of $k$ varieties each.
Variations in yield due to the differences between
the smaller blocks in the productivity of
the soil are removed from the error which
otherwise would be used. The blocks of $k$
plots are made the unit for error control instead
of the replicates of $k^2$ plots. The number
of replications must be a multiple of two for
simple lattices and a multiple of three for
triple lattices.


Lattice square designs (2, 4, 23) have
some of the features of Latin squares (9) in
that the $k^2$ varieties are planted in the form of
a $k \times k$ square in each replicate. Variations
in soil fertility can then be eliminated both
between rows and between columns in each
square or replicate. The lattice squares most
commonly used are those with $\tfrac{1}{2}(k+1)$ replications
for 25, 49, 81 and 121 varieties, or
with $(k+1)$ replications for 9, 16, 25, 49 and
64 varieties.


Cubic lattice designs (21) will have blocks
of $k$ plots for $k^3$ varieties and require three replications
or multiples of this number. With
these designs as many as 1,000 varieties can
be compared without requiring large individual
blocks.


Balanced lattice designs (18, 22) of the
most useful type are those for testing 9, 16,
25, 49, 64, 81 or 121 varieties in blocks containing
the square root of the number of
varieties with $k+1$ replications of the $k^2$
varieties. These designs are said to be balanced
since each pair of treatments occurs together
once in an incomplete block. All varieties
are compared with equal precision.
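The balance property can be checked on a small constructed example. The sketch below builds a balanced lattice for a prime $k$ (9, 25, 49 or 121 varieties) by the familiar rows, columns and diagonals arrangement of a $k \times k$ array; this construction is an assumption of the sketch, not a method taken from the references, and it does not cover $k = 4$ or $k = 8$. Randomization of blocks and plots is omitted, and all names are illustrative.

```python
from collections import Counter
from itertools import combinations


def balanced_lattice(k):
    """Return k + 1 replicates for k*k varieties, assuming k is prime.

    Varieties are numbered 0 .. k*k - 1 and placed in a k x k array with
    variety v in row v // k and column v % k. Each replicate is a list
    of k incomplete blocks of k varieties.
    """
    # Replicate 1: the rows of the array are the blocks.
    replicates = [[[i * k + j for j in range(k)] for i in range(k)]]
    # Remaining k replicates: block c holds the cells with j - m*i = c (mod k).
    # m = 0 gives the columns; m = 1, 2, ... give the diagonals.
    for m in range(k):
        blocks = [[] for _ in range(k)]
        for i in range(k):
            for j in range(k):
                blocks[(j - m * i) % k].append(i * k + j)
        replicates.append(blocks)
    return replicates


def pair_counts(replicates):
    """Count how often each pair of varieties shares an incomplete block."""
    counts = Counter()
    for replicate in replicates:
        for block in replicate:
            counts.update(combinations(sorted(block), 2))
    return counts


design = balanced_lattice(3)          # 9 varieties, 4 replicates
counts = pair_counts(design)
print(len(counts), set(counts.values()))
```

In this arrangement the first two replicates alone (rows and columns) form a simple lattice and the first three a triple lattice, in which some pairs of varieties never meet in a block; only the full set of $k+1$ replicates brings every pair together once.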


All of the above lattice designs may be 
analyzed legitimately as randomized complete 
blocks if desired (7, 22). If the soil varies 
less within the incomplete blocks than be- 
tween blocks within complete replications, the 
lattice designs will give a smaller experimental 
error than randomized complete blocks. The in- 
crease in efficiency through the use of the 
lattice designs will depend on the number of 
varieties to be tested and the nature of the soil 
variation. In 20 corn yield trials in Iowa, 
Cochran (3) found that triple lattices with 
three replications were somewhat more ac- 
curate than randomized blocks with five repli- 
cations. With lattice squares the increase 
in accuracy over randomized blocks repre- 
sented a saving of about one replication in 
six with 25 varieties, one in five with 49 or 81 
varieties and one in three with 121 varieties. 
Wellhausen (18) found that the average effi- 
ciency of 60 balanced lattice designs with 
nine varieties of corn in four replications was 
144 percent as compared with randomized
blocks. These were planted by county agents
and farmers who made no special effort to 
select uniform soil. Twenty-six lattice squares 
with 25 and 49 varieties had an efficiency of 
140 percent relative to randomized blocks. 
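The efficiency figures can be turned into equivalent numbers of replications. The sketch assumes that efficiency is the ratio of the error variance of randomized blocks to that of the lattice, so that one lattice replication is worth efficiency/100 randomized-block replications. That reading, and the function name, are assumptions rather than statements from the studies cited.

```python
def equivalent_replications(efficiency_percent, lattice_replications):
    """Randomized-block replications giving the same precision as the lattice.

    Assumes efficiency is a ratio of error variances expressed in percent.
    """
    return efficiency_percent / 100.0 * lattice_replications


# Balanced lattices, nine varieties, four replications, 144 percent efficiency.
print(equivalent_replications(144, 4))
# A saving of one replication in six corresponds to 120 percent efficiency.
print(equivalent_replications(120, 5))
```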


The analysis of experiments in incomplete 
blocks depends upon their balance, which is 
lost if plots are missing. Methods are avail- 
able for replacing such missing values (2, 
5, 6) but when several plots or whole varieties 
are missing it is simpler to disregard the differ- 
ences between blocks within complete repli- 
cations. The missing values are then com- 
puted by the simpler methods for randomized 
complete blocks or some varieties are omitted 
entirely. Consequently, incomplete block de- 
signs cannot be used to full advantage where 
missing data occur frequently. While the 
results can always be analyzed as randomized 
complete blocks the complete replications may 
be larger than is desirable for the control 
of field heterogeneity. If missing plots or 
missing varieties are of frequent occurrence,
many plant breeders prefer to use several randomized
block designs each with not over
20 varieties rather than the larger lattices 
which include all varieties. One or more 
standard varieties are then included as checks 
in each randomized block experiment. 


Another design for comparing a larger 
number of varieties divides them into several 
groups, the same grouping being used in all 
replicates. The groups are randomized within 
replicates and the varieties are randomized 
within groups. One or more standard varieties 
are included in each group, even though they 
may not be used in computing the experi- 
mental errors. Such designs lead to two errors, 
one applicable for comparisons of varieties 
in the same group and another for comparisons 
of varieties in different groups. Frequently 
these two errors are markedly different. Such
designs are likely to prove inferior in efficiency 
to the lattice designs. 


For varietal trials combined with such 
tests as different fertilizers, dates of seeding 
and methods of disease control, split plot 
designs are very useful (12, 19, 20). For ex- 
ample, in some South American countries the 
planting of corn or wheat may extend over a 
long period of time. Varietal comparisons 
are often combined with tests on date of 
seeding. The most practicable field arrange- 
ment is to lay out large plots at random 
within each replicate for the different dates 
of seeding and randomize the varieties in sub- 
plots within each large plot. Split plot de- 
signs lead to two or more separate errors, 
each applicable to the test of significance for 
certain treatments. These errors usually differ 
in precision and practicability frequently dic- 
tates which are to be the main plots with the 
larger error. In other cases, the experimenter 
can determine through the design of the 
experiment the treatments on which he is 
willing to sacrifice precision in exchange for 
a lower error in some other comparisons of 
more vital interest. 
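The field arrangement described above, with dates of seeding on the large plots and varieties on the sub-plots, can be sketched as a randomization routine. The treatment labels, the function name and the use of a seeded generator are illustrative assumptions.

```python
import random


def split_plot_layout(main_treatments, sub_treatments, replicates, seed=None):
    """Randomize a split plot trial.

    Main-plot treatments (e.g. dates of seeding) are randomized within each
    replicate; sub-plot treatments (e.g. varieties) are randomized afresh
    within each main plot.
    Returns a list of replicates, each a list of (main, [sub-plot order]).
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(replicates):
        mains = list(main_treatments)
        rng.shuffle(mains)                 # large plots at random in the replicate
        replicate = []
        for main in mains:
            subs = list(sub_treatments)
            rng.shuffle(subs)              # varieties at random in the large plot
            replicate.append((main, subs))
        layout.append(replicate)
    return layout


# Hypothetical trial: three dates of seeding, four varieties, three replicates.
for number, replicate in enumerate(
        split_plot_layout(["early", "medium", "late"],
                          ["V1", "V2", "V3", "V4"], 3, seed=1945), start=1):
    print("Replicate", number, replicate)
```

Because the dates are compared between large plots and the varieties within them, the two sets of comparisons carry separate errors, as the text notes.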


In Minnesota a variety is not recommended
to the farmers until it has been tested for
three years in several parts of the state, and
has been proved superior to existing varieties
in one or more important characters and not
deficient in any essential respect. Yield is
considered only one of the important characters.
The constancy of yield may be determined
from the interaction of varieties by years and
the uniformity of performance in different
places in the state may be tested through the
interaction of varieties by places as compared
with the residual error (12, 14, 24). Varieties
are discarded as ruthlessly for susceptibility to
important diseases, deficiencies in an important
agronomic character or for a serious lack of
qualities for commercial use as for low yield.


The incidence of some diseases can best
be expressed as a percentage of infected plants.
Since the error of a percentage, $100\sqrt{pq/n}$
with $q = 1 - p$, depends on the expected
proportion $p$, difficulties may arise in an
analysis of variance. Especially if the range in
the percentages is great, it may be necessary to
transform the percentages into a form in which
the errors are independent of the unit of measure.
This is accomplished by transforming the percentages
to an angle $\phi$, where the percent
$= 100 \sin^2 \phi$, as suggested by Bliss (1, 10).
Such transformed data may be used in ordinary
analyses of variance.
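A minimal sketch of the angular transformation follows. It computes the angle from percent = 100 sin² φ and contrasts the error of a raw percentage, which changes with p, with the approximate error of the angle, which depends only on n. The approximation 90/(π√n) degrees is the usual large-sample value and is an addition of this sketch; the function names are illustrative.

```python
import math


def angular_transform(percent):
    """Angle in degrees such that percent = 100 * sin(angle)**2."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))


def inverse_angular_transform(angle_degrees):
    """Recover the percentage from the angle."""
    return 100.0 * math.sin(math.radians(angle_degrees)) ** 2


def percentage_standard_error(percent, n):
    """Standard error of an observed percentage, 100 * sqrt(p*q/n)."""
    p = percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


def angular_standard_error(n):
    """Approximate standard error of the angle in degrees (large n)."""
    return 90.0 / (math.pi * math.sqrt(n))


# Hypothetical infection percentages, each based on n = 50 plants.
for percent in (5, 25, 50, 90):
    print(percent,
          round(angular_transform(percent), 1),
          round(percentage_standard_error(percent, 50), 2),
          round(angular_standard_error(50), 2))
```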


Correlation or regression methods are use- 
ful in studying the relations between charac- 
ters in varieties or strains at plant breeding 
nurseries. These methods may be combined 
with the analysis of variance to study by 
covariance (8, 9, 11, 17) the effect of diseases 
or of shriveling of the seed and other charac- 
ters upon yield. Many character interrelation- 
ships may be brought to light and this know- 
ledge used in the selection of desirable charac- 
ter combinations. 


Some characters are best expressed in
categories. In studying their possible associations
the $\chi^2$ test for independence has proved
a very useful statistical tool (8, 17).
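The $\chi^2$ test for independence can be sketched for a table of counts as below. The counts are hypothetical (strains classified by two categorical characters), and the function returns only the statistic and its degrees of freedom, to be referred to a table of $\chi^2$.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a two-way table.

    table : list of rows of observed counts, all rows of equal length.
    """
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the two characters were independent.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts: rows = resistant / susceptible, columns = awned / awnless.
counts = [[42, 18],
          [23, 37]]
statistic, df = chi_square_independence(counts)
print(round(statistic, 2), df)
```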

Frequently, the mode of inheritance of 
new characters can be studied during the early 
segregating generations of a hybridization program.
Genetic information can then be ob-

(Continued on page 28) 


LITERATURE CITED


1. Bliss, C. I. The transformation of percentages for use in the analysis of variance. Ohio Jour. Sci. 38: 9-12. 1938.
2. Bliss, C. I. and R. B. Dearborn. The efficiency of lattice squares in corn selection tests in New England and Pennsylvania. Proceedings Amer. Soc. Hort. Sci. 41: 324-342. 1942.
3. Cochran, W. G. An examination of the accuracy of lattice and lattice square experiments on corn. Iowa Agr. Exp. Sta. Res. Bul. 289. 1941.
4. Cochran, W. G. Some additional lattice square designs. Iowa Agr. Exp. Sta. Res. Bul. 318. 1943.
5. Cornish, E. A. The recovery of the inter-block information in quasi-factorial designs with incomplete data. Coun. Sci. Ind. Res. (Aust.) Bul. 158. 1943.
6. Cornish, E. A. The recovery of inter-block information in quasi-factorial designs with incomplete data. 2. Lattice squares. Coun. Sci. Ind. Res. (Aust.) Bul. 175. 1944.
7. Cox, Gertrude, Robert C. Eckhardt and W. G. Cochran. The analysis of lattice and triple lattice experiments in corn varietal tests. Iowa Agr. Exp. Sta. Bul. 281. 1940.
8. Fisher, R. A. Statistical methods for research workers. Oliver and Boyd, Edinburgh. Ed. 9. 1944.
9. Fisher, R. A. The design of experiments. Oliver and Boyd, Edinburgh. Ed. 3. 1942.
10. Fisher, R. A. and F. Yates. Statistical tables for biological, agricultural and medical research. Oliver and Boyd, Edinburgh. 1943.
11. Goulden, C. H. Methods of statistical analysis. John Wiley and Sons, Inc., New York. 1939.
12. Hayes, H. K. and F. R. Immer. Methods of plant breeding. McGraw-Hill Book Company, New York. 1942.
13. Immer, F. R. Formulae and tables for calculating linkage intensities. Genetics 15: 81-98. 1930.
14. Immer, F. R., H. K. Hayes and LeRoy Powers. Statistical determination of barley varietal adaptation. Jour. Amer. Soc. Agron. 26: 403-419. 1934.
15. Immer, F. R. and M. T. Henderson. Linkage studies in barley. Genetics 28: 419-440. 1943.
16. Mather, K. The measurement of linkage in heredity. Chemical Publishing Company of New York, Inc. 1938.
17. Snedecor, George W. Statistical methods. Iowa State College Press, Ames, Iowa. Ed. 3. 1940.
18. Wellhausen, E. J. The accuracy of incomplete block designs in varietal trials in West Virginia. Jour. Amer. Soc. Agron. 35: 66-76. 1943.
19. Yates, F. The principles of orthogonality and confounding in replicated experiments. Jour. Agri. Sci. 23: 108-145. 1933.
20. Yates, F. Complex experiments. Supp. Jour. Royal Stat. Soc. 2: 181-247. 1935.
21. Yates, F. The recovery of inter-block information in varietal trials arranged in three dimensional lattices. Ann. Eugenics 9: 136-156. 1939.
22. Yates, F. The recovery of inter-block information in balanced incomplete block experiments. Ann. Eugenics 10: 317-325. 1940.
23. Yates, F. Lattice squares. Jour. Agri. Sci. 30: 672-687. 1940.
24. Yates, F. and W. G. Cochran. The analysis of groups of experiments. Jour. Agri. Sci. 28: 556-580. 1938.


NOTES ON ANALYSIS OF EXPERIMENTS REPLICATED IN TIME


By H. G. Wilm
Forest Service, U. S. Dept. of Agriculture 


In experiments designed to test the effects 
of various treatments upon some factor such 
as crop yield, the investigator may find it de- 
sirable to repeat his studies for several years 
on the same study areas. The purpose of such 
replications in the time dimension is to find 
out whether changes in climate, or in other 
factors which vary with time, may exert any 
real influence upon the general effect of the 
treatments that are under scrutiny; whether the 
magnitude of treatment effects fails to remain
the same in different years.


Since the interaction of treatment effects 
with such time factors frequently has a sub- 
stantial bearing on the general applications 
of experimental results, the repetition of ex- 
periments over several seasons or other periods 
of time has become a common practice. In 
field application this is an easy process; 
but the analysis of results is not so simple, 
and some debate exists among investigators as 
to the most desirable method. Although ar- 
ticles on different aspects of this topic have 
appeared in British journals, (1, 2, 8) the 
general principles of this kind of analysis 
have not been generally available to American 
investigators. Hence it seems desirable that 
one research worker who has labored inde- 
pendently with this important and common 
problem submit his interpretation of these 
principles and an outline of the appropriate 
analytic method. 


As a typical example, we may consider 
a randomized block design containing four 
blocks of five plots with five treatments as- 
signed at random to the plots within each 
block; the treatments have been repeated with- 
out rerandomization on the same plots over a 
series of three years. As usual, it will be 
convenient to assume for the sake of our 
analysis that the four blocks and these three 
years represent a random sample drawn from 
a population of blocks that are generally simi- 
lar to ours, and from a longer series of 
years to which our results might be applied. 
This assumption will require that the results 


of this experiment be scrutinized with some
conservatism.

The data from an actual experiment based
on this design are presented in Table 1. In
this particular case, the magnitudes of the
variable studied (soil moisture deficits under
a forest) were not perceptibly affected by any
cumulative influence of time as expressed by
observations taken in successive years; nor
was such an influence to be expected during
the period of study. This characteristic seems
to be desirable for the main purpose of our
present discussion, as it simplifies the explanation
of principles and the analysis of the
data. In many experiments, however, treatments
such as fertilizers applied for a series
of years on the same areas may exert a systematic
influence on crop yields or other factors
being tested. In these cases the yields may
exhibit variation which is correlated with time
as well as random variation. The presence of
such systematic variation does not alter the
general principles of our analysis, but it
does require an extra analytic step.

Since the soil moisture experiment was repeated
with the same assignment of treatments
to the plots each year, its structure is
analogous to that of a split plot design in
which the three “splits,” or years, within each
plot exhibit variations associated only with
time. Hence the form of analysis is simply
that of any split plot experiment (7, pp. 72-75)
as displayed in the left half of Table 2.

In this type of analysis the total sum of
squares is divided into two main portions:

1. The sum of squares “between
plots,” with 19 degrees of freedom in
our example. This portion is further
subdivided into the typical
analysis of a randomized block experiment
(4), in which the several
sums of squares are calculated from
the three-year sums obtained for each
of the 20 plots.

2. The sum of squares “within
plots,” which is calculated from the
data for individual years within each
plot and is subdivided to provide all
the variances associated with years.




As in any split plot experiment, each of
the two major parts of this analysis is
characterized by its own “experimental error”
mean square, which may be employed to test
the real nature, or significance, of the comparisons
contained within that part of the
whole analysis. In the first part, for example,
the treatment mean square may be compared
with that calculated for the interaction of treatments
with blocks (place error, Table 2),
which provides an estimate of the failure
of the treatments to behave alike from place to


Table 1. Soil Moisture Deficits as Affected by Timber Cutting
(All data in inches of water)


Treatment*                    Block                    Treatment
and year             A        B        C        D        sums

Uncut (11,900)
  1941            2.40     0.98     1.38     1.37        6.13
  1942            3.32     1.91     2.36     1.62        9.21
  1943            2.59     1.44     1.66     1.75        7.44
  Sum             8.31     4.33     5.40     4.74       22.78

6,000
  1941            1.76     1.65     1.69     1.11        6.21
  1942            2.78     2.07     2.98     2.50       10.33
  1943            2.27     2.28     2.16     2.06        8.77
  Sum             6.81     6.00     6.83     5.67       25.31

4,000
  1941            1.43     1.30     0.18     1.66        4.57
  1942            2.51     1.48     1.83     2.36        8.18
  1943            1.54     1.46     0.16     1.84        5.00
  Sum             5.48     4.24     2.17     5.86       17.75

2,000
  1941            1.24     0.70     0.69     0.82        3.45
  1942            3.29     2.00     1.38     1.98        8.65
  1943            2.67     1.44     1.75     1.56        7.42
  Sum             7.20     4.14     3.82     4.36       19.52

None
  1941            0.79     0.21     0.01     0.16        1.17
  1942            1.70     1.44     2.65     2.15        7.94
  1943            1.62     1.26     1.36     1.87        6.11
  Sum             4.11     2.91     4.02     4.18       15.22

Block sums
  1941            7.62     4.84     3.95     5.12       21.53
  1942           13.60     8.90    11.20    10.61       44.31
  1943           10.69     7.88     7.09     9.08       34.74
  Sum            31.91    21.62    22.24    24.81      100.58


* Expressed as volume in board-feet of trees larger than 9.6 inches in diameter, which
were left in the forest after treatment.




Table 2. Analysis of variance, soil moisture experiment


Source of variation         Degrees of  Mean     Pure variances contained in mean square
                            freedom     square   s²_t    s²_b    s²_y    s²_tb   s²_ty   s²_by   s²_tby

Total                       59          0.5577
(Between plots)             (19)
  (t)   Treatments          4           1.3333   12                      3       4               1
  (b)   Blocks              3           1.4832           15              3               5       1
  (tb)  Treatments x blocks:
        “Place” error       12          0.3909                           3                       1
(Within plots)              (40)
  (y)   Years               2           6.5418                   20              4       5       1
  (ty)  Treatments x years:
        “Time” error        8           0.2554                                   4               1
  (by)  Blocks x years      6           0.1294                                           5       1
  (tby) Triple interaction  24          0.1053                                                   1


place in the population sampled. And in the 
second part, the mean square for “years” and 
the interactions associated with years may be 
compared with a second error term, provided 
by the triple interaction. 
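The partition just described can be sketched directly from the plot data of Table 1. In the sketch below, the two entries for the uncut treatment in 1941 that are illegible in the printed table (blocks A and D) are taken as 2.40 and 1.37, the values fixed by the printed row and column sums. The function and variable names are illustrative assumptions.

```python
# DATA[treatment][year] lists the deficits for blocks A, B, C, D (Table 1).
DATA = {
    "uncut": [[2.40, 0.98, 1.38, 1.37], [3.32, 1.91, 2.36, 1.62], [2.59, 1.44, 1.66, 1.75]],
    "6000":  [[1.76, 1.65, 1.69, 1.11], [2.78, 2.07, 2.98, 2.50], [2.27, 2.28, 2.16, 2.06]],
    "4000":  [[1.43, 1.30, 0.18, 1.66], [2.51, 1.48, 1.83, 2.36], [1.54, 1.46, 0.16, 1.84]],
    "2000":  [[1.24, 0.70, 0.69, 0.82], [3.29, 2.00, 1.38, 1.98], [2.67, 1.44, 1.75, 1.56]],
    "none":  [[0.79, 0.21, 0.01, 0.16], [1.70, 1.44, 2.65, 2.15], [1.62, 1.26, 1.36, 1.87]],
}


def split_plot_in_time(data):
    """Return {source: (degrees of freedom, mean square)} for the split plot analysis."""
    x = list(data.values())                      # x[treatment][year][block]
    nt, ny, nb = len(x), len(x[0]), len(x[0][0])
    values = [v for t in x for y in t for v in y]
    correction = sum(values) ** 2 / len(values)  # correction for the mean

    def ss(sums, n_per_sum):
        # Sum of squares among sums that each contain n_per_sum observations.
        return sum(s * s for s in sums) / n_per_sum - correction

    ss_total = ss(values, 1)
    ss_t = ss([sum(map(sum, t)) for t in x], ny * nb)
    ss_b = ss([sum(x[t][y][b] for t in range(nt) for y in range(ny))
               for b in range(nb)], nt * ny)
    ss_y = ss([sum(sum(x[t][y]) for t in range(nt)) for y in range(ny)], nt * nb)
    # "Between plots": three-year sums for each of the 20 plots.
    ss_plots = ss([sum(x[t][y][b] for y in range(ny))
                   for t in range(nt) for b in range(nb)], ny)
    ss_ty_cells = ss([sum(x[t][y]) for t in range(nt) for y in range(ny)], nb)
    ss_by_cells = ss([sum(x[t][y][b] for t in range(nt))
                      for b in range(nb) for y in range(ny)], nt)
    ss_tb = ss_plots - ss_t - ss_b
    ss_ty = ss_ty_cells - ss_t - ss_y
    ss_by = ss_by_cells - ss_b - ss_y
    ss_tby = ss_total - ss_plots - ss_y - ss_ty - ss_by
    parts = {
        "t": (nt - 1, ss_t), "b": (nb - 1, ss_b), "tb": ((nt - 1) * (nb - 1), ss_tb),
        "y": (ny - 1, ss_y), "ty": ((nt - 1) * (ny - 1), ss_ty),
        "by": ((nb - 1) * (ny - 1), ss_by),
        "tby": ((nt - 1) * (nb - 1) * (ny - 1), ss_tby),
    }
    return {name: (df, total / df) for name, (df, total) in parts.items()}


for source, (df, mean_square) in split_plot_in_time(DATA).items():
    print(f"{source:4s} {df:3d} {mean_square:8.4f}")
```

With these data the mean squares agree with the left half of Table 2 to the four decimals printed there.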


For the purpose of our present experiment, 
the most interesting feature of the “with- 
in plots” analysis is the interaction of treat- 
ments with years (time error, Table 2). Since 
this mean square estimates the failure of the 
treatments to behave alike from year to 
year, it provides an error term in the time 
dimension which, like the treatment-block in- 
teraction, may be used to test the significance 
of treatment effects. 


We have outlined the general form of 
analysis for this kind of experiment, have 
partitioned it into its appropriate comparisons, 
and have calculated a mean square for each. 
With obvious minor changes, the same prin- 
ciples may be applied to other experimental 
designs. Before we proceed to test the signif- 
icance of treatment effects, however, we must 
consider one further complication, which may 
be clarified by inspection of the internal struc- 
ture of the mean squares obtained in our 
analysis of variance. As shown by Fisher (3, 
Sec. 40) and elucidated by Tippett (5) and 
Winsor and Clarke (6), each mean square con- 
tains one or more “pure” variances, each esti- 
mating a single component of variation such 
as the effect of treatments or the interaction 
of treatments with blocks; and each variance 


is included one or more times in the mean 
square, the number of times being equal to 
the number of observations contained in the 
sums from which the mean square is com- 
puted. Thus the triple interaction contains 
only a single variance, taken only once be- 
cause this mean square was calculated from 
single observations. And the treatment-block 
interaction, in our analysis, contains the triple 
interaction variance and also the pure treat- 
ment-block variance; the latter is included n 
times in the mean square, with n equal to the 
number of years contained in each of the 
20 sums from which this mean square was 
calculated. Hence the treatment-block mean 
square may be expressed as 


$$V_{tb} = n s^2_{tb} + s^2_{tby}$$


where $s^2$ is an estimate of a pure variance, and
the subscripts indicate the character of the
mean square or variance. Similarly, the
treatment-year mean square may be stated
as


$$V_{ty} = q s^2_{ty} + s^2_{tby}$$


where $q$ is the number of blocks contained in
each sum from which this mean square was
calculated. And, as a final example, the
treatment mean square contains a series of
variances: that due to the pure effect of the
treatments, and the pure variances which estimate
the effects of all the interactions associated
with treatments:


$$V_t = qn s^2_t + n s^2_{tb} + q s^2_{ty} + s^2_{tby}$$


Expressed in terms of the particular experiment
under discussion (with five treatments,
four blocks, and three years), these
equations become

$$V_{tb} = 3 s^2_{tb} + s^2_{tby}$$

$$V_{ty} = 4 s^2_{ty} + s^2_{tby}$$

$$V_t = (4)(3) s^2_t + 3 s^2_{tb} + 4 s^2_{ty} + s^2_{tby}$$
In these quantitative terms, the contents of 
all the mean squares in our analysis are 
presented in the right half of Table 2. 
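The rule stated in the text, that each pure variance enters a mean square as many times as there are observations in the sums concerned, can be written as a short routine. A mean square for a given source is taken to contain every pure variance whose subscripts include those of the source, with a coefficient equal to the product of the numbers of levels of the factors absent from the subscripts. This generalization and the names used are assumptions of the sketch; for five treatments, four blocks and three years it reproduces the right half of Table 2.

```python
from itertools import combinations


def mean_square_contents(levels):
    """Coefficients of the pure variances contained in each mean square.

    levels : dict mapping a one-letter factor label to its number of levels,
             e.g. {"t": 5, "b": 4, "y": 3}.
    Returns {source: {component: coefficient}}.
    """
    factors = list(levels)
    terms = [c for size in range(1, len(factors) + 1)
             for c in combinations(factors, size)]
    table = {}
    for source in terms:
        row = {}
        for component in terms:
            if set(source) <= set(component):
                coefficient = 1
                for factor in factors:
                    if factor not in component:
                        coefficient *= levels[factor]
                row["".join(component)] = coefficient
        table["".join(source)] = row
    return table


contents = mean_square_contents({"t": 5, "b": 4, "y": 3})
for source, row in contents.items():
    print(source, row)
```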


Assuming that each of the pure variances 
in the above three equations is significantly 
greater than zero, each contributes a real 
amount to the several mean squares. This 
fact brings us to the main point of our dis- 
cussion: either of our two error terms for 
testing treatment effects ($V_{tb}$ or $V_{ty}$) contains
only two components of error, while the
treatment mean square contains four: not only
the pure variance associated with treatment 
and those contained in either one of the 
error terms, but also an additional interaction 
variance which is not directly associated with 
the treatment effects to be tested, but is a 
part of the other error term. In itself this 
fact does not invalidate our tests; but it does 
call for one further step in analysis. When 
we have proceeded this far, we can ask two 
alternative questions (keeping in mind the 
need for conservatism imposed by our initial 
assumption that these blocks and years ac- 
tually represent a random sample of popula- 
tions in space and time) : 


1. In years that are similar to 
those included in our experiment, are 
the treatments likely to exert a real 
effect if applied in other blocks in 
the population of which our blocks 
are a sample? and 


2. On the average for blocks that 
are generally similar to ours, are the 
treatments likely to exert a real effect 
if applied in other years contained in 
the time population of which our 
three years are a sample?


In order to answer the first question,
we calculate the values estimated by our experiment
for each of the several pure variances
(see Table 2); subtract the treatment-year
component ($q s^2_{ty}$) from the treatment mean
square; and test the residual mean square by
comparison with the treatment-block mean
square. The second question requires a similar
procedure, except that we subtract the
treatment-block component ($n s^2_{tb}$) and use
the treatment-year mean square as the error
term.


For the present experiment, these pro- 
cedures may be expressed quantitatively as 
follows: 

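1) $$\begin{aligned}
V'_t &= 12 s^2_t + 3 s^2_{tb} + s^2_{tby} \\
     &= V_t - 4 s^2_{ty} \\
     &= 1.3333 - 0.1501 \\
     &= 1.1832
\end{aligned}$$

Thence $F' = V'_t / V_{tb} = 1.1832 / 0.3909 = 3.03$; and

2) $$\begin{aligned}
V''_t &= 12 s^2_t + 4 s^2_{ty} + s^2_{tby} \\
      &= V_t - 3 s^2_{tb} \\
      &= 1.3333 - 0.2856 \\
      &= 1.0477
\end{aligned}$$

Thence $F'' = V''_t / V_{ty} = 1.0477 / 0.2554 = 4.10$.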


These calculated values for F’ and F” will not 
follow the mathematical distribution of F 
exactly; and the number of degrees of free- 
dom which should be associated with the 
two comparisons has not yet been clarified.


Estimates of significance obtained from 
these two tests should not be much in error, 
however, if they are based on somewhat fewer 
degrees of freedom than those indicated in 
the analysis. 
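The two adjusted tests can be reproduced from the mean squares of Table 2. The sketch estimates each interaction component as the excess of its mean square over the triple interaction, removes from the treatment mean square the component belonging to the other error term, and forms the ratio. The function name is an illustrative assumption, and no significance levels are attached, since the ratios do not follow the F distribution exactly.

```python
def adjusted_f_ratios(v_t, v_tb, v_ty, v_tby):
    """Return (F', F'') for testing treatments against place and time errors.

    v_t, v_tb, v_ty, v_tby : mean squares for treatments, treatments x blocks,
    treatments x years and the triple interaction.
    """
    tb_component = v_tb - v_tby        # estimates n * s2_tb
    ty_component = v_ty - v_tby        # estimates q * s2_ty
    v_t_place = v_t - ty_component     # treatment mean square freed of time error
    v_t_time = v_t - tb_component      # treatment mean square freed of place error
    return v_t_place / v_tb, v_t_time / v_ty


f_place, f_time = adjusted_f_ratios(1.3333, 0.3909, 0.2554, 0.1053)
print(round(f_place, 2), round(f_time, 2))
```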


The steps outlined above are, of course, un- 
necessary if the treatment-year (or treatment- 
block) mean square proves not to be signifi- 
cantly greater than the triple interaction, so 
that the associated pure variance does not 
contribute a significant amount to the treat- 
ment mean square. 


If desired, one more analytic step may be 
taken in order to derive a maximum quantity 
of information from the analysis. Wherever
the treatment-block or treatment-year
mean square indicates a real interaction of
the treatments with either place or time,
the above procedures may be usefully supplemented
by the separate analysis of each
block or year, accompanied by scrutiny of the


data and of any characteristics peculiar to 
each plot and year. The purpose of such 
analysis is to explain, if possible, the nature 
of the interaction and the reasons for its 
significance. In the experiment under dis- 
cussion, for example, the effects of timber cut- 
ting were found to fluctuate significantly from 
year to year; and these fluctuations were found 
to depend largely on the amount and distribu- 
tion of the antecedent summer rainfall which 
reached the soil through the forest canopy 
remaining after treatment. 


As indicated in our introductory dis-
cussion, the variation between years and the
interaction of treatments with years may often
contain a systematic as well as a random com-
ponent, the former expressing a cumulative
effect of past treatments or perhaps only
a carry-over effect of the preceding year’s
treatments. In such cases, as outlined by
Cochran and others (1, 2, 8), the systematic
component may be isolated by covariance anal-
ysis. In this additional step, the sums of
squares connected with years are broken down
into components that are associated with and
independent of the regression of treatments on
some factor which expresses the systematic
time effect. Where a cumulative effect is sus-
pected, the logical procedure is to fit one or
more terms of the regression of treatments on
years, employing the method of “orthogonal
polynomials” perfected by Fisher (4, Sec.
14.6). And in cases where only a carry-over
effect is suspected, the experimental results
may be adjusted by a regression of the cur-
rent year’s results on those of the preceding
year.
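As a minimal illustration of the orthogonal polynomial step, the sketch below splits the sum of squares for three equally spaced years into its linear and quadratic parts, using the coefficients (-1, 0, +1) and (+1, -2, +1). It is applied to the year sums of Table 1 purely to show the arithmetic; no cumulative trend was expected in that experiment, and a full analysis would apply the same contrasts to the treatment-year sums as well. The function name is an assumption.

```python
def polynomial_components(year_sums, observations_per_sum):
    """Linear and quadratic sums of squares for three equally spaced years.

    year_sums            : totals for the three years, in time order
    observations_per_sum : number of observations in each total
    """
    linear = (-1, 0, 1)
    quadratic = (1, -2, 1)

    def contrast_ss(coefficients):
        contrast = sum(c * s for c, s in zip(coefficients, year_sums))
        divisor = observations_per_sum * sum(c * c for c in coefficients)
        return contrast ** 2 / divisor

    return contrast_ss(linear), contrast_ss(quadratic)


# Year sums for 1941, 1942, 1943 from Table 1; each contains 20 observations.
ss_linear, ss_quadratic = polynomial_components([21.53, 44.31, 34.74], 20)
print(round(ss_linear, 4), round(ss_quadratic, 4), round(ss_linear + ss_quadratic, 4))
```

The two parts add up to the sum of squares for years (2 degrees of freedom at a mean square of 6.5418 in Table 2), as they must for a complete set of orthogonal contrasts.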


LITERATURE CITED


1. Cochran, W. G. Long-term agricultural experiments. Supp. Jour. Royal Stat. Soc. 6: 104-148. 1939.
2. Crowther, F. and W. G. Cochran. Rotation experiments with cotton in the Sudan Gezira. Jour. Agr. Sci. 32: 390-405. 1942.
3. Fisher, R. A. Statistical methods for research workers. Oliver and Boyd, Edinburgh. 1938.
4. Snedecor, G. W. Statistical methods. Iowa State College Press, Ames, Iowa. Ed. 3. 1940.
5. Tippett, L. H. C. The methods of statistics. Williams and Norgate Ltd., London. Ed. 2. 1937.
6. Winsor, C. P. and G. L. Clarke. A statistical study of variation in the catch of plankton nets. Sears Foundation: Jour. Marine Res. 3: 1-34. 1940.
7. Yates, F. The design and analysis of factorial experiments. Imp. Bur. Soil Sci. Tech. Com. 35. 1937.
8. Yates, F. and W. G. Cochran. The analysis of groups of experiments. Jour. Agr. Sci. 28: 556-580. 1938.


THE GEOGRAPHICAL DISTRIBUTION OF GENES DETERMINING 


INDIVIDUAL HUMAN BLOOD DIFFERENCES 


Philip Levine, M.D., F. A. C. P.
Ortho Research Foundation 


The physical anthropologist employs ra- 
cial characteristics, the heredity of which is 
most uncertain. The serologist has the great 
advantage of studying blood properties which 
are inherited according to simple mendelian 
laws. This holds true especially for the blood 
properties A and B (the four blood groups) 
and the factors M and N. A considerable im- 
petus to the study of the geographical distri- 
bution of blood properties was supplied re- 
cently by the discovery of the Rh factor and 
its importance in a specific form of selective 
fetal and neonatal morbidity (erythroblastosis 
fetalis). 




The accepted theory of the heredity of 
the AB system was discovered by Bernstein 
(a mathematician who never carried out any 
blood tests) by the analysis of gene fre- 
quencies for several racial (geographic) groups 
characterized by varying incidence of the four 
blood groups. 


Much has been contributed to the early 
history of man by geographic studies of only 
one set of three allelomorphic genes (O, A 
and B). The two factors most important in 
explaining the varying distribution of the 
blood groups are first isolation and then mixture.
In the very early history of man, the
isolation of very small groups with the loss
of the rarer gene for the factor B and the occasional
loss of the more frequent gene for the
factor A resulted in populations which are puzzling
to the anthropologist, i.e., identical racial
groups with striking differences in the incidence
of the A and B factors and entirely
different races with very similar distribution of
the blood groups. These serve as excellent
examples of Wright’s mechanism of gene fixation,
the geographic distribution of which is
largely or entirely accidental.


Even the most striking discrepancies in
any race observed in the studies with the
factors A and B may be reconciled by the findings
with other blood factors, for it is improbable
that gene fixation in small isolated groups
will affect more than one gene at a time. Thus,
Levine and Matson have shown that two
tribes of American Indians, one having a high
incidence of group O, and the other with a
preponderance of group A, have equally high
values for the factor M and both tribes have
equally low incidence of taste-blindness to
para-ethoxy-phenyl-thio-urea.


Although systematic studies are lacking,
there is considerable information with regard
to the geographic distribution of the subgroups
of group A and factor P of Landsteiner
and Levine. Thus, American Indians
having a high incidence of group A are almost
exclusively $A_1$; a random white population of
New York City has 5 or 6 times as much $A_1$
as $A_2$, while Negroes in the same area have
somewhat less $A_1$ than $A_2$. The factor P is
much more frequent in Negroes than in white
individuals. These studies are significant even
though these factors are not as well defined
serologically as the factors A, B, M, N or Rh.


Investigations on the Rh factor have already
revealed its significance in racial relationships.
Its role in the pathogenesis of erythroblastosis
fetalis is now well established.
The view has been expressed that the Rh-rh
gene is responsible for more fetal and neonatal
deaths than any other gene difference known.
Because of the morbid effects specifically on
heterozygous infants (Rhrh), the stability of the
Rh gene in any population is of considerable
importance especially in those races with a 
high incidence of rhrh individuals. The racial 
incidence of this disease is directly propor- 
tional to the value of rhrh individuals in any 
given population. Thus, corresponding to the 
value of Rh negative (rhrh) individuals in 
whites, Negroes and Chinese, 15%, 5 to 
8%, and 1% respectively, the disease is three 
times more frequent in whites than in Negroes 
and is almost entirely unknown among Chinese. 
This selection against the heterozygous and ac- 
tually against the less frequent recessive gene, 
which follows from Levine’s theory, has been 
discussed recently by many noted scientists. 
Most workers support the view that the genes 
are of comparatively recent origin so that 
they have not yet reached a state of equilib- 
rium. 


The current genetic theory is that the varying
types of Rh reactions (phenotypes) are determined
by a series of multiple allelomorphs.
A complex antigenic and corresponding genetic 
structure is to be expected in any factor dem- 
onstrated by isoimmunization, in contrast to 
the simpler antigenic and genetic scheme of 
the MN system demonstrable by heteroimmuni- 
zation. Racial studies on a vast scale similar 
to those on A and B should yield much more 
significant data than that yielded by geo- 
graphic studies of M and N or perhaps even 
A and B. As with the four blood groups, the 
correct and final theory of the heredity of the 
Rh system will emerge from or be confirmed 
by statistical analysis of the distribution of 
the various Rh subtypes in many races. Such 
studies should be most valuable in spite of 
selection against the recessive type since sev- 
eral races are already known to have a very 
low incidence of Rh negative individuals 
(American Indians, Chinese, and Japanese). 


Selective fetal death induced by the factors 
A and B, or by any other factors like Rh 
capable of inducing isoimmunization of the 
mother by the fetus, must now be considered 
in genetic and geographic studies of all racial 
groups. 




GRADUATE WORK IN STATISTICS AT COLUMBIA UNIVERSITY


Since every statistical method has a great 
variety of possible applications, the traditional 
practice of teaching statistical methods as 
if they were branches of one or another 
of the applications is evidently doomed. The 
teaching arrangements prevailing in the past 
might be compared to the teaching of chem- 
istry, zoology, anatomy or bacteriology in an 
imaginary medical school as incidental ac- 
tivities of the departments concerned with 
the various kinds of disease, each department 
teaching as much of these sciences as it 
considered necessary for the treatment of its 
own disease. In such a medical school chem- 
istry might be taught in the Department of 
Cardiology by a cardiologist, and quite inde- 
pendently in the Department of Obstetrics by 
an obstetrician. Each department would see 
to it that its own instructor in chemistry really 
knew the disease to which that department was 
consecrated, but it would regard chemistry in 
general as a very minor concern. It would not
object if its man happened also to be a decent 
chemist, provided he did not wander off 
into chemistry so far as to give the impression 
that his feet were not solidly on the ground. 
Such a school would do little for the advance- 
ment of chemistry. Its students would not 
have the benefit of chemical instruction as 
accurate, complete and modern as could be 
supplied by genuine specialists in chemistry. 

The advantages of specialization and di- 
vision of labor point clearly to the future 
teaching of statistics by specialists in the 
subject, a class devoted to the increase and 
diffusion of knowledge regarding statistical 
methods and theory. Only such specialists, 
removed from pressure to devote too much of 
their time to particular applications, can hope 
to concentrate sufficiently upon the subject- 
matter of statistical theory and method to 
purify it of its errors, explore and strengthen 
its foundations, build up its superstructure, 
and transmit it as a living body of knowledge 
rather than as an old and defective tool to 
the newer generation. Such a scholarly group 
must be organized around its own subject, 
statistics. 

While the future of statistical teaching will 
thus be in the hands of specialists, there are
difficulties about the transition to that future.
If all the colleges and universities in the 
country should now undertake to put into
effect the change indicated, they would find
it impossible for the simple reason that there
do not exist specialists in statistical theory
and methods in anything like sufficient numbers.
A necessary preliminary is the development
of the specialists. The problem of obtaining
competent scholars specializing in
statistics for the college and university fac- 
ulties is made more difficult by the strong 
demand from industry and from government, 
as well as from research organizations of varied 
types, for individuals having practically the 
same type of preparation and ability. 
Columbia University has undertaken a pro- 
gram of graduate work in statistics designed 
for the preparation of scholars who will work 
on the highest levels. Graduate students may 
enroll for the Ph.D. degree in statistics under 
the supervision of an interdepartmental com- 
mittee. Each student’s work is arranged to 
include pure mathematics, mathematical sta- 
tistics, and a field in which statistical methods 
can be applied. The relative time allotted 
to study under each of these heads varies from 
individual to individual according to previous 
education and experience. All must go through 
a closely integrated series of courses in mathe- 
matical statistics, beginning with the logical 
and mathematical theories of probability and 
proceeding through the various major divisions 
of statistical theory. In these courses the main 
emphasis is on exact statement and careful 
derivation of statistical principles and methods, 
with the limitations of each method made clear 
in discussing the assumptions underlying the 
derivations. Attention is given to the prac- 
tical devices and approximations that have to 
be used in the absence of suitable agreement 
between empirical situations and the assump- 
tions underlying standard methods, or on ac- 
count of the inadequacy of existing mathe- 
matics. These courses also include diversified 
practical examples, and some training is given 
in computing and in careful writing. They 
enable the student to acquire a certain amount 
of facility in practical statistical work, and also 
bring him to the threshold of research by
pointing out the large number of unsolved
problems that confront the statistician. 


In preparation for research, students are 
encouraged to include as much pure mathematics
as possible in their studies. A really
good command of calculus is necessary be- 
fore beginning the curriculum in statistics, and 
elementary matrix algebra and theory of 
functions are studied in the first year by 
those who have not had them earlier. The 
Department of Mathematics also provides 
courses in more advanced mathematics of value 
in statistics, including among others finite dif- 
ferences and elementary and advanced differen- 
tial equations, the last involving some special 
functions used in statistics. 


Work in a field of application is particu- 
larly stressed in the case of students who have 
concentrated heavily on mathematics as un- 
dergraduates. The aim is not so much to de- 
velop statistical economists, psychologists, or 
biological research workers as to provide the 
statistician with the essential equipment for 
cooperating with specialists in at least one empirical
field, and for bridging the gap between
them and mathematics. The chief aim of this 
part of the training is to make the mathe- 
matical statistician so definitely aware of the 
kind of situation facing the practical worker 
that the former will concentrate his ingenuity 
on the provision of tools of real value to the 
latter. The relation is felt definitely to be 
that of tool-maker and tool-user; the maker
must know well the uses of his tools in order
to design them efficiently. The actual fields
of application chosen by students are diverse. 
One student selected life insurance and ac- 
tuarial work, another vital and public health 
statistics, others genetics, industrial quality 
control, economics, and other subjects. Study 
of these fields is sometimes in courses in 
the relevant university departments or some- 
times by individual guided study in the case 
of students who have a strong foundation in a 
particular field acquired in undergraduate 
study. In such a field as industrial quality con- 
trol, practical experience in the Bell Telephone 
Laboratories or other companies having well- 
organized quality control departments may 
meet the basic need. 


Most of the classes in mathematical statistics
meet in the late afternoon or evening,
and many of the students hold jobs, either 
full- or part-time, often of such a nature as 
to bring them into contact with practical prob- 
lems involving statistics. Still, candidates for 
the doctorate are warned that they will need 
at least one year, and probably more, of 
full-time graduate study at the university, in 
addition to study that can be carried on while 
holding a job. 


The classes in mathematical statistics in- 
clude many students who are not candidates 
for the doctorate in statistics, but are study- 
ing the subject for the sake of its applications 
to their major subjects or vocations. In normal 
times these courses are taken as electives by 
numerous undergraduates, and by students 
under the various graduate and professional 
faculties. There are groups of graduate students
majoring in education and psychology, in
economics and business, in philosophy, mathematics,
engineering, physics, chemistry and
biology of various kinds, with a few in 
sociology, history and scattered subjects. All 
are admitted if they have the necessary mathe- 
matical prerequisites. 


Columbia University has no master’s degree
in statistics as such. However, many
students working largely in mathematical sta- 
tistics receive M.A. degrees in economics, 
mathematics or other departments. 


The group working in mathematical sta- 
tistics at Columbia University engages in re- 
search in statistical theory and methodology. 
Its members also advise and collaborate
with numerous members of the university 
faculty and others regarding the statistical 
aspects of their work and the design of their 
experiments. The membership of this group 
of faculty and students overlaps that of the 
Statistical Research Group, a university or- 
ganization dealing with war problems referred 
to it by the government. 


The interdepartmental committee in 
charge of the program for candidates for the 
degree of Doctor of Philosophy in Statistics 
consists of Dean George B. Pegram and Professors
F. E. Croxton, R. S. Lynd, F. C.
Mills, J. F. Ritt, Abraham Wald, Helen M. 
Walker, and Harold Hotelling (chairman). 


NEWS AND NOTES 


The Bureau of Ships has three biometricians
from Cornell plotting its course. Lt. C.
McC. MOTTLEY, formerly Associate Professor of
Limnology and Fisheries, is head of the Operational
Analysis Sub-section with the Quality
Control Section of the Research and Standards
Branch of the Bureau in Washington. With
him is Lt. WALTER C. JACOB, formerly research
assistant in the Department of Vegetable Crops
(seafood), New York State College of Agriculture.
He spent some time as an administrative
officer at a Naval Training School in
Richmond before coming with the Bureau
where he is applying statistical methods in
the development of materials for naval use.
Also engaged in this work is Lt. DANIEL R.
EMBODY. He was an instructor in Limnology
and Fisheries and spent a year at sea before
he joined this group.


MARION M. SANDOMIRE, who was statistical
editor of the U. S. Department of Agriculture 
Publication Division, spent 20 months in The 
Inspector General’s Office, where she was in 
charge of the statistical aspects of surveys of 
automotive maintenance in the army. She 
has been with the Navy Department, Bureau 
of Ships, for the past year, applying statistical 
methods to research problems. 


In June, D. B. DeLury will join the stat- 
istical staff of the Virginia Polytechnic In- 
stitute and the Virginia Agricultural Experi- 
ment Station at Blacksburg. He received his 
degrees from the University of Toronto and 
spent a year of post doctorate work with Harold 
Hotelling at Columbia University. He has 
taught at Saskatchewan and the University of Toronto
and has been statistical consultant for several 
government agencies. A generous grant from 
the General Education Board has made pos- 
sible the expansion of the statistical work 
at Blacksburg. 


Associate Director F. R. Immer has re- 
turned to St. Paul from one of those secret 
missions. During the trip he talked with R. A. 
FISHER, who now has the chair of Genetics
in Cambridge University. A direct report 
says that F. R. Immer returned with a couple 
of very good stories for our reunion at the 
next Biometrics Section meeting. With so many 
section members abroad, things should be going
well over there! A. E. BRANDT reports
“While I was in England fighting the cold, 
the fog, and the smoke—.” And we thought
he was helping with the war! To this, add
the fact that his chief recreation has been
wild boar hunting. One wonders—until it 
is noted that he met Frank Yates and Fred- 
erick F. Stephan in Paris. Official approval 
was granted for the report that FRANK YATES 
has a war job as statistical advisor to a 
general scientific advisor to an important 
military man!—an important job which has 
an immediate bearing on operations. It seems 
to be no secret, for it was in print, that FREDERICK
F. STEPHAN is in Paris to participate
in a special bombing survey for the Air Trans- 
port Command ... In the usual confidential 
manner it can be stated that RENSIS LIKERT is
in the European Theater of Operations on a
special research assignment for the War Department
... If anyone knows where W. J.
YOUDEN is now, keep it a secret! A log of
his activities since those 7:00 A.M. classes at 
Iowa State during the summer of 1942 may be 
expected at some future date. D. B. DeLury 
especially wishes a report on who gets him 
up in the mornings . . . During the absence 
of Churchill Eisenhart, statistician of the 
University of Wisconsin, J. H. Torrie is acting 
as statistical advisor for the Agricultural Ex- 
periment Station. He has been giving courses 
in statistical methods and in experimental design.
DOROTHY MOSHER, a graduate in
Mathematics at the University of Wisconsin,
has been appointed assistant in the statistical
office there ... E. W. LINDSTROM, head of
the Genetics Department at Iowa State
College, is in Medellin, Colombia, South
America, lecturing in the College of Agricul- 
ture and helping to initiate a research program 
in plant genetics . . . A report came last January
that R. A. FISHER was about to depart for
India for several months. Does someone want
to verify the report? JOHN WISHART, M. G.
KENDALL, BRADFORD HILL and others are working
for the war but not directly with military
operations . . . ALAN E. TRELOAR of the University
of Minnesota has joined the staff
of the Statistical Research Group. You might 
look for him from Maine to Florida... 
GLENN BURTON, Coastal Plain Experiment Station,
visited North Carolina State College
recently, and during his stay the local research
workers in genetics and plant breeding
had a luncheon meeting with him. Since
1941, J. Neyman has been director of the 
Statistical Laboratory at Berkeley, California. 
The Laboratory as a unit is busy on war work 
under the Applied Mathematics Panel. Four
members of the Laboratory, GEORGE B. DANTZIG,
F. W. DRESCH, MARK W. EUDEY and
ERICH LEHMANN, are with the services but
hope to return as soon as circumstances permit
... HOLLY FRYER, statistician at Kansas
State College, is on leave to serve with The 
Division of War Research at Columbia Univer- 
sity. When he wrote, he was “deep in the 
heart of Texas” but his family is in Tenafly, 
New Jersey. We don’t know where he is 
registered for Selective Service so consider that 
all above addresses may or may not be correct
... We are being told repeatedly that important
advances in statistical theory and practice
have been worked out by the various
groups. A member of one group states, “All 
of us have had plenty of new problems which 
could not help influencing our thinking and, 
in time, many novelties are likely to appear 
in the literature available to all”... How 
about your sending some news items—that 
is, if you want correct news about yourself 
reported! 


QUERIES 


QUERY What is the error to be used for test- 
ing the significance of treatments in this ex- 
periment? Six treatments were applied to 
laying hens in 12 cages each containing 10 
birds. Each treatment was given in two cages 
chosen at random. 


ANSWER The experimental error is the mean
square, 575.9, for cages receiving the same
treatment. The value, F = 1,074.5/575.9 = 1.87,
with degrees of freedom $n_1 = 5$ and $n_2 = 6$,
is small as compared to the 5% point,
4.39. Hence, in the absence of any further information,
one concludes that these treatments
may have no effect on the production of eggs.


Analysis of variance of numbers of eggs laid

Source of variation                Degrees of    Sum of      Mean
                                   freedom       squares     square
Treatments                              5         5,372.3    1,074.5
Cages receiving same treatment          6         3,455.6      575.9
Hens in same cage                     108        32,164.7      297.8


The analysis of variance contains some
additional information. Apparently there were
inequalities in the environments of the cages.
Evidence is found in the ratio, F = 575.9/297.8
= 1.93, df = 6 and 108, which is just
below the 5% point, 2.19. This leads one to
suspect that such things as light, humidity, air
currents, temperature and incidence of communicable
diseases may have affected egg production.
If these things differentiated production
in cages receiving the same treatment,
they undoubtedly affected production in cages
with different treatments; hence, the large
experimental error.

The testing of significance in a table like
that above is made clearer to some by subdivision
of the mean squares into parts representing
the three sources of variation assumed
present. First, there is the variation in egg
production by individuals occupying the same
cage, estimated by $s^2 = 297.8$. For the purposes
of this answer, this will be taken as common
to all cages. Next, there is the average
variation of production in pairs of cages
treated alike, estimated by $s_1^2$. The mean
square for cages receiving the same treatment
is the sum, $s^2 + 10 s_1^2$, 10 being the number
of birds per cage (Fisher’s “Statistical Methods,”
section 40. Snedecor’s “Statistical
Methods,” section 10.14. Winsor and Clarke,
“A Statistical Study of Variation in the Catch
of Plankton Nets,” Journal of Marine Research,
3:25-27). Finally, the mean square for treatments
may have the additional component,
$s_T^2$, with the coefficient, 20, the number of hens
receiving each treatment.

The three sample values of $s^2$ are each
estimates of variances, $\sigma^2$, in the sampled populations.
The following table summarizes the
argument:


Source of variation    Mean       Individuals    Mean square is
                       square     per group      an estimate of
Treatments             1,074.5        20         $\sigma^2 + 10\sigma_1^2 + 20\sigma_T^2$
Cages                    575.9        10         $\sigma^2 + 10\sigma_1^2$
Individuals              297.8         1         $\sigma^2$


Now, F tests some null hypothesis; for
example, that one of the $\sigma$'s is zero. The
significance of differences among treatment
means is determined by testing the hypothesis,
$\sigma_T = 0$, in the ratio,

$$\frac{\sigma^2 + 10\sigma_1^2 + 20\sigma_T^2}{\sigma^2 + 10\sigma_1^2}$$

It is clear that if $\sigma_T$ is zero, then this ratio
is 1. The corresponding experimental value
of F is

$$F = \frac{1{,}074.5}{575.9} = 1.87$$

The question asked in the test of significance
is this: what is the probability of a greater
excess over 1 in random sampling from a
population in which $\sigma_T^2 = 0$? Comparison
with the tabular value of F shows this probability
to be considerably greater than 0.05;
that is, there is little evidence against the hypothesis.


The other null hypothesis tested above is
that $\sigma_1 = 0$ in the variance ratio,

$$\frac{\sigma^2 + 10\sigma_1^2}{\sigma^2}$$

for which F = 575.9/297.8 = 1.93, as before.


GEORGE W. SNEDECOR
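
The arithmetic of this answer can be retraced with a short Python sketch. It recomputes the mean squares and the two F ratios from the sums of squares in the table, then solves the expected mean squares for the three components. The component estimates are not stated in the answer; they are an illustration that follows from the table of expectations, and every variable name is an assumption of the sketch.

```python
# Nested analysis of variance for the hen query: treatments > cages > hens.
# Sums of squares and degrees of freedom are copied from the table in the answer.
ss = {"treatments": 5372.3, "cages": 3455.6, "hens": 32164.7}
df = {"treatments": 5, "cages": 6, "hens": 108}

hens_per_cage = 10        # coefficient of the cage component
hens_per_treatment = 20   # two cages of ten birds per treatment

# Mean square = sum of squares / degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

# Treatments are tested against cages treated alike; cages against hens.
f_treatments = ms["treatments"] / ms["cages"]
f_cages = ms["cages"] / ms["hens"]

# Moment estimates of the components (an illustration, not given in the answer):
#   MS(hens)       estimates sigma^2
#   MS(cages)      estimates sigma^2 + 10 sigma_1^2
#   MS(treatments) estimates sigma^2 + 10 sigma_1^2 + 20 sigma_T^2
s2 = ms["hens"]
s1_squared = (ms["cages"] - ms["hens"]) / hens_per_cage
sT_squared = (ms["treatments"] - ms["cages"]) / hens_per_treatment

print(f"F (treatments vs cages) = {f_treatments:.2f}")
print(f"F (cages vs hens)       = {f_cages:.2f}")
print(f"components: s2 = {s2:.1f}, s1^2 = {s1_squared:.1f}, sT^2 = {sT_squared:.1f}")
```

The tabular 5% points (4.39 and 2.19) are taken from the answer and are not recomputed here, since that would need an F distribution routine outside the standard library.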


6 
QUERY Eight treatments were compared in
a latin square field plot arrangement. The
analysis of variance was:


Source of variation    Degrees of freedom    Mean square
Treatments                      7                15.39
Error                          42                 7.89

F = 1.95        5% F = 2.24


The treatment means were 14.0, 17.3, 14.4,
13.3, 14.0, 12.7, 13.9 and 13.6 tons. The
greatest difference is between treatments 2
and 6, 4.6 tons, the standard error of this difference
being 1.404 tons. This makes t = 3.28,
a highly significant value.


According to the F-test, there is no significant
difference; yet according to the t-test,
there is a highly significant difference. What
interpretation is possible under this situation?


ANSWER It is not informative to compare
the two tests that you have made. The
F-test is exact, the null hypothesis tested being
this: The 8 treatment means are randomly
drawn from a single normal population. The
probability of a larger F from the hypothetical
population is about 0.11.


The hypothesis tested by t is that the
two means are randomly drawn from a normal
population. Since you selected the largest and
smallest means for comparison, the tabulated
probabilities of t are not applicable. Fisher
has suggested a method for using the available
tables for your purpose (Design of Experiments,
section 24). The difference you have
chosen is 1 of 28 comparisons that might be
made among the 8 treatment means. It is proposed,
therefore, that the probability to be
required of the selected difference be not 1
in 20 but 1 in (28)(20) = 560. Since the
probability of your value of t is more than
1 in 560 (it is about 1 in 460), the conclusion
based on this t-test is about the same as that
reached from the F-test.


There is another test which is often useful:
the range from the lowest mean to the
highest, 17.3 - 12.7 = 4.6 tons, may be
compared to the range expected in samples of
8. This may be approximated from Egon
S. Pearson’s table A in Biometrika, 24 (1932),
page 416. There it is shown that the range
$4.29\sigma$ will be exceeded in 5 percent of
random samples of 8 drawn from a normal
population with standard deviation $\sigma$. Using
the sample estimate of $\sigma$, $\sqrt{7.89/8} = 0.99$ ton,
the 5 percent range is (4.29)(0.99) = 4.25
tons. Thus, the sample range slightly exceeds
the 5 percent point.


Summarizing: the statistical evidence is 
that under the hypotheses tested neither F 
nor t is as unusual as 1 in 20 but that the 
range exceeds the 5 percent point. As for
interpretation, that must rest on biological 
considerations. Is there good reason for 
thinking that treatment 2 is outstanding 
despite the fact that apparently you had 
not suspected it? Is this in accord with 
other evidence? This treatment may be a 
genuine “find” which would call for more 
critical experimentation. On the other hand, 
the unusually high yield may turn out to 
be an experimental freak. 

GEORGE W. SNEDECOR,
Iowa State College.
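
The three comparisons in this answer can be retraced in a few lines of Python. This is a minimal sketch that reuses only the figures quoted in the query and the answer; the factor 4.29 is Pearson's tabled value as cited above, the figure of 8 plots per treatment is taken from the divisor in the answer's estimate of the standard error, and the variable names are assumptions of the sketch.

```python
from math import comb, sqrt

ms_treatments, ms_error = 15.39, 7.89
plots_per_treatment = 8   # divisor used in the answer's standard error of a mean
means = [14.0, 17.3, 14.4, 13.3, 14.0, 12.7, 13.9, 13.6]

# 1. Overall F test of the eight treatment means.
f_ratio = ms_treatments / ms_error

# 2. t for the largest difference, which was selected after seeing the data.
sample_range = max(means) - min(means)
se_difference = sqrt(2 * ms_error / plots_per_treatment)
t_value = sample_range / se_difference

# Fisher's adjustment: the chosen pair is one of C(8, 2) possible pairs,
# so odds of 1 in 20 become 1 in (number of pairs) * 20.
n_pairs = comb(len(means), 2)
required_odds = n_pairs * 20

# 3. Range test: 4.29 standard errors of a mean is the 5 percent point for n = 8.
se_mean = sqrt(ms_error / plots_per_treatment)
range_5_percent = 4.29 * se_mean

print(f"F = {f_ratio:.2f}, t = {t_value:.2f}")
print(f"pairs = {n_pairs}, required odds = 1 in {required_odds}")
print(f"range = {sample_range:.1f}, 5 percent range = {range_5_percent:.2f}")
```

Carrying the unrounded standard error 0.993 gives a 5 percent range near 4.26 tons rather than the 4.25 obtained from the rounded 0.99; the conclusion that the observed range of 4.6 tons exceeds it is unchanged.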


QUERY In comparing the reactions of soil
samples should pH or the concentration of
the hydrogen ion be used in testing the significance
of the difference between treatments?


ANSWER So far as statistical convenience is 
concerned choose the variate that most nearly 
conforms to the mathematical model of the 
experiment. For example, if two or more 
groups of samples are compared, it is pleasant 
to have the variate distributed normally with 
equal variances in the groups. If regression 
is involved, the easiest relation to handle is 
the linear with equal variances for all values 
of x. For randomized blocks, one would like 
the deviations of the variate in the several 
plots to be randomly assorted and normally 
distributed. 

In my experience, the pH scale has most 
often met both the biological and statistical 
specifications. 


GEORGE W. SNEDECOR,
Iowa State College. 
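
The choice of scale matters because pH is the negative common logarithm of the hydrogen-ion concentration, so an average taken on one scale does not correspond to the average on the other. The small Python sketch below shows this with three hypothetical soil samples; the pH values are invented for illustration and do not come from any experiment discussed here.

```python
from math import log10
from statistics import mean

# Hypothetical pH readings for three soil samples (illustrative only).
ph_values = [5.0, 6.0, 7.0]

# Convert each reading to hydrogen-ion concentration: [H+] = 10 ** (-pH).
concentrations = [10 ** -ph for ph in ph_values]

# Average on the pH scale versus average on the concentration scale.
mean_ph = mean(ph_values)
ph_of_mean_concentration = -log10(mean(concentrations))

print(f"mean of the pH values        = {mean_ph:.2f}")
print(f"pH of the mean concentration = {ph_of_mean_concentration:.2f}")
```

On the concentration scale the most acid sample dominates the average, which is one reason the two variates can lead to different tests of significance.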


ABSTRACTS 


(10) 


LUSH, Jay L. (Iowa State College). Chance as
a Cause of Changes in Gene Frequency Within 
Pure Breeds of Livestock. 


The N individuals reaching breeding age 
in each generation are a sample of two N 
gametes from the preceding generation. These 
N individuals in turn are the universe from 
which another sample of two N gametes (if 
the population is constant in size) are taken 
to constitute the next generation. The pro- 
cession of the generations is statistically the 
sampling of a sample from a sample, from a 
sample, etc. If the probability of becoming a 
parent of an animal to reach breeding age in 
the next generation were uniform for all those 
which reach breeding age in the parental generation,
the variance of gene frequency (q) due
to chance would be $q(1-q)/2N$ in one generation
and t times as much in t generations, except as
q(1-q) becomes damped down when q approaches
zero or 1.0.
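
The one-generation variance can be checked by simulating the sampling of 2N gametes directly. The Python sketch below is an illustration under the abstract's idealized assumption of equal chances of parenthood; the population size, starting frequency, number of replicates and seed are all arbitrary choices made for the example.

```python
import random
from statistics import pvariance


def next_generation_frequency(q, n_individuals, rng):
    """Draw 2N gametes at random and return the gene frequency among them."""
    gametes = 2 * n_individuals
    copies = sum(1 for _ in range(gametes) if rng.random() < q)
    return copies / gametes


rng = random.Random(1945)                     # fixed seed so the run is repeatable
q, n_individuals, replicates = 0.5, 50, 5000  # illustrative values only

simulated = [next_generation_frequency(q, n_individuals, rng)
             for _ in range(replicates)]

observed_variance = pvariance(simulated)
expected_variance = q * (1 - q) / (2 * n_individuals)

print(f"expected q(1-q)/2N = {expected_variance:.5f}")
print(f"simulated variance = {observed_variance:.5f}")
```

The simulated variance should fall close to q(1-q)/2N, here 0.0025, with the remaining gap due to the finite number of replicates.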


Actually the probability of achieving 
parenthood varies widely among those reaching 
breeding age in the parental generation. This 
makes the chance changes in gene frequency 
much larger than this formula indicates. 




Among the important causes for this wide 
variation in number of descendants are: (1) 
sterility of some members of the parental gen- 
eration, (2) correlation between the fates 
of relatives, (3) many are sold into grade 
herds where they have no chance to contribute 
to the future pure breed, (4) functional 
stratification of purebred herds into a few herds 
(“centers of radiation” they would be called in 
discussions of evolution) which sell mainly 
to other purebred herds which have for their 
main business producing sires for use in 
commercial herds, (5) fame of a few sires 
and dams to such an extent that nearly all 
their close relatives are eagerly sought for 
use in other purebred herds, while the relatives
and descendants of the many sires and
dams which do not achieve this fame go largely 
into commercial herds or into the purebred 
herds which sell to the commercial herds. 

Several examples of how much gene 
frequency may have varied by chance because 
of the extensive use of one or a few breed- 
ing animals are cited. A summary of the 
studies of inbreeding in 17 pure breeds of 
livestock indicates an average increase of about 
0.4 to 0.6 percent per generation in the inbreeding
coefficient. This is equivalent to values of 80
to 130 for N in the preceding formula and
would give the chance $\sigma_q$ in one generation a
value of 0.03 to 0.04 when q is near 0.5.
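
The closing figures can be reproduced from the formula q(1-q)/2N. The Python sketch below assumes the standard relation that the inbreeding coefficient rises by about 1/(2N) per generation, which the abstract does not state explicitly; with N between 80 and 130 it gives an increase of roughly 0.4 to 0.6 percent and a standard deviation of gene frequency between about 0.03 and 0.04 at q = 0.5. Function names are illustrative.

```python
from math import sqrt


def chance_sd(q, n):
    """Standard deviation of gene frequency after one generation of sampling."""
    return sqrt(q * (1 - q) / (2 * n))


def inbreeding_increase(n):
    """Per-generation rise in the inbreeding coefficient, assumed to be 1/(2N)."""
    return 1 / (2 * n)


for n in (80, 130):
    print(f"N = {n}: inbreeding increase = {100 * inbreeding_increase(n):.2f} percent, "
          f"sd of q at 0.5 = {chance_sd(0.5, n):.3f}")
```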


(Continued from page 15) 
tained at little extra cost. The $\chi^2$ test (8) is
the usual method for determining the agreement
between observed ratios and those expected
on the basis of some genetic hypothesis.
Possible linkage between the factors for two
character pairs may be tested by calculating
$\chi^2$ for independence. If the two ratios appear
to be linked, the next step is to measure


The BIOMETRICS BULLETIN is published 
six times a year for its Biometrics Section by 
the American Statistical Association, 1603 K 
Street, N. W., Washington 6, D. C.; President 
WALTER A. SHEWHART; Secretary-Treasurer
and Managing Editor, LESTER S. KELLOGG.


Officers of the Biometrics Section: C. I.
BLISS, Chairman; H. W. Norton, Secretary.

Editorial Committee for the BIOMETRICS 
BULLETIN: GERTRUDE M. COX, Chairman,
C. I. Bliss, W. G. Cochran, F. R. Immer, J.
Neyman, H. W. Norton, L. J. Reed, G. W.
Snedecor, Sewall Wright. 


Membership dues in the Biometrics Section
and the American Statistical Association combined
are $6.00 per year, including the JOURNAL
OF THE AMERICAN STATISTICAL




able (13). 
Fs as well, the method of maximum likelihood
(8, 15, 16) is used to estimate the recombination
percentage which best satisfies all available
data.


ASSOCIATION, the BIOMETRICS BULLETIN,
and the ASA BULLETIN; for Associate
Members of the Section, dues are $2.00 per
year, which includes the BIOMETRICS BULLETIN.
Single copies are 60 cents and annual
subscriptions $3.00.


Subscriptions and applications for membership
should be sent to the American Statistical
Association, 1603 K Street, N. W., Washington
6, D. C.


Material for the BULLETIN should be
addressed to the Chairman of the Editorial
Committee, Institute of Statistics, North Carolina
State College, Raleigh, N. C., to
“Queries”, Statistical Laboratory, Iowa State
College, Ames, Iowa, or to any member of
the committee.


